BIG DATA SECURITY




Faculty Mentor:
Dr. C Komalavalli

Student Names:
Rahel Williams (MCA-II)
Ayush Thomas (MCA-II)



1. INTRODUCTION:

The Internet has transformed the way IT industries grow. These industries now routinely work with data at the petabyte and exabyte scale. Big data has been a game changer for most of them: analyzing it with the right technologies supports better decisions for the development of the industry. Security and privacy are major concerns wherever big data is involved, and security issues must be addressed before data can be analyzed reliably. Big data tools and technologies can be applied in the healthcare industry, where securing healthcare data is a significant challenge. This article discusses the big data lifecycle, its security issues, and tips for securing healthcare data.

2. WHAT’S BIG DATA?

Big data refers to data sets so large and complex that traditional database management systems and software techniques cannot process them. Its applications include healthcare, traffic management, banking, media, and entertainment.



Fig 1: Characteristics of Big Data.

Volume: Refers to the vast amount of data generated every second, minute, and hour. Social media websites and their databases, for example, generate data continuously. The data can range from kilobytes to terabytes and beyond, and it can be stored as records and files.

Variety: Refers to the heterogeneous nature of the data, which can take the form of images, text, voice, etc.

Velocity: Refers to the speed at which data is generated. Every day, 900 million photos are uploaded to Facebook, 500 million tweets are posted on Twitter, 0.4 million hours of video are uploaded to YouTube, and 3.5 billion searches are performed on Google.
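
To put these daily figures in perspective, they can be converted into average per-second rates. The short Python sketch below does only that arithmetic; the daily counts are the ones quoted above, and the dictionary and function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day

# Daily volumes as quoted in the text; keys are illustrative labels.
daily_volumes = {
    "Facebook photos": 900_000_000,
    "tweets": 500_000_000,
    "YouTube video hours": 400_000,
    "Google searches": 3_500_000_000,
}


def per_second(daily_count: int) -> float:
    """Convert a per-day count into an average per-second rate."""
    return daily_count / SECONDS_PER_DAY


rates = {name: per_second(count) for name, count in daily_volumes.items()}

for name, rate in rates.items():
    print(f"{name}: about {rate:,.1f} per second")
```

On these numbers, the photo uploads alone average a little over ten thousand per second, which is why velocity is treated as a defining characteristic.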

3. SECURE BIG DATA:

Security must not be confused with privacy. Security means protecting data against unauthorized access while preserving its integrity and availability. It also means protecting data from attackers and data thieves. Privacy, by contrast, means protecting an individual's sensitive information from being accessed by a third-party organization.

4. WHY IS SECURITY IMPORTANT?

1. Security prevents loss of information.
2. Security prevents malware infections and theft of information.
3. Security protects servers from physical damage.
4. Security guards against programming errors that could corrupt files.
5. Security helps avoid legal problems arising from the loss of information.

5. HEALTHCARE AS A MAJOR APPLICATION OF BIG DATA:

Big data techniques can be applied intensively in the healthcare industry. Big data, together with the Internet of Things (IoT), has helped in tracking health, predicting the outcomes of medical tests, reducing hospital charges, maintaining Electronic Health Records, and analyzing patterns in common diseases. Big data analytics in healthcare promises great benefits, yet it also presents many barriers and challenges.

6. SECURING BIG HEALTHCARE DATA:

Healthcare organizations store, maintain, and transfer huge volumes of records to support the analysis that produces patient reports. The healthcare industry remains at high risk of security breaches. Attackers can use data mining techniques to uncover sensitive data and make it public, which is one way a breach happens. It has therefore become critical for organizations to implement security solutions.



Fig 2: Big Data lifecycle.

6.1 Data Collection:

This phase involves collecting information from various sources and in different formats. Because the sources are so varied, it is important that the data comes from trusted ones. Security measures must be taken to protect the data from unauthorized access, loss, and misuse.

6.2 Data Transformation:

Once the data has been collected, the second step is to filter it down to the necessary data and transform that into a meaningful data set. Sensitive information must not be transformed, and transformations are governed by access control mechanisms.

6.3 Data Modelling:

Once the data has been transformed, it is analyzed to build models. Data mining techniques such as clustering, classification, and association can be employed for this. Modeling interprets the data to support research, scientific, and business decisions.

6.4 Knowledge Creation:

The modeling phase produces knowledge that decision makers can use. This final information guides them and helps achieve the intended goal.

Here are a few tips to help you guard data against security breaches:

1. ENCRYPTION:

Encryption transforms a data set so that only trusted, authorized parties who hold the decryption key can read it. It therefore helps protect data against threats such as packet sniffing and theft of storage devices.
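
The idea can be illustrated with a minimal Python sketch. This is not a production scheme: it uses a one-time pad (XOR with a random key exactly as long as the message) only because it fits in the standard library. The sample record and the function names are hypothetical, and a real deployment would rely on a vetted implementation of a standard cipher such as AES.

```python
import secrets


def generate_key(length: int) -> bytes:
    """Return a cryptographically random key of the given length."""
    return secrets.token_bytes(length)


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of equal length (encrypts and decrypts)."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


# Hypothetical health record used only for illustration.
record = b"patient=1042;diagnosis=hypertension"

key = generate_key(len(record))      # must be kept secret and never reused
ciphertext = xor_bytes(record, key)  # unreadable without the key
recovered = xor_bytes(ciphertext, key)

print(recovered == record)
```

Someone who sniffs the ciphertext on the network or steals the disk it is stored on learns nothing useful without the key, which is the property the tip relies on.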

2. SECURE DATA STORAGE:

Generated data has to be stored in a suitable repository, so managing storage is a crucial task. Auto-tiering can be used for this: it lets us manage and store data on different storage tiers.
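
A tiering policy can be as simple as a rule based on how recently data was accessed. The sketch below assumes three hypothetical tiers ("hot", "warm", "cold") and arbitrary age thresholds chosen for illustration; actual auto-tiering products apply their own policies.

```python
from datetime import date

# Assumed thresholds, in days since last access; not taken from any product.
HOT_MAX_DAYS = 7
WARM_MAX_DAYS = 90


def choose_tier(last_accessed: date, today: date) -> str:
    """Pick a storage tier from the number of days since last access."""
    age_days = (today - last_accessed).days
    if age_days <= HOT_MAX_DAYS:
        return "hot"    # fast, expensive storage
    if age_days <= WARM_MAX_DAYS:
        return "warm"   # mid-range storage
    return "cold"       # slow, inexpensive archive


today = date(2024, 1, 31)

# Hypothetical data sets with their last-access dates.
datasets = {
    "icu_vitals": date(2024, 1, 29),
    "lab_results_q4": date(2023, 12, 15),
    "archived_scans": date(2022, 6, 1),
}

placement = {name: choose_tier(d, today) for name, d in datasets.items()}
print(placement)
```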

3. PERIMETER BASED SECURITY:

Data must be validated and filtered at every entry and exit point of the system to ensure that it arrives from a trusted source and leaves for a trusted destination.
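
As a rough sketch of such checks, the Python below validates inbound records against an allowlist of sources and outbound transfers against an allowlist of destinations. The host names, field names, and function names are all hypothetical placeholders.

```python
# Hypothetical allowlists; a real system would load these from configuration.
TRUSTED_SOURCES = {"clinic-a.example", "lab-b.example"}
TRUSTED_DESTINATIONS = {"archive.example"}
REQUIRED_FIELDS = {"patient_id", "source", "payload"}


def validate_inbound(record: dict) -> list:
    """Return a list of problems found in an inbound record (empty if none)."""
    problems = []
    missing = REQUIRED_FIELDS - set(record)
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if record.get("source") not in TRUSTED_SOURCES:
        problems.append(f"untrusted source: {record.get('source')!r}")
    return problems


def validate_outbound(destination: str) -> bool:
    """Allow data to leave only for an allowlisted destination."""
    return destination in TRUSTED_DESTINATIONS


good = {"patient_id": 1042, "source": "clinic-a.example", "payload": "..."}
bad = {"patient_id": 7, "source": "unknown.example"}

print(validate_inbound(good))
print(validate_inbound(bad))
print(validate_outbound("archive.example"))
```

Records that fail validation would be rejected or quarantined at the boundary rather than passed on to the analysis pipeline.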

4. GRANULAR AUDITING:

Granular auditing helps determine when missed attacks occurred and why they were not noticed in the first place. It is an ongoing process and must be performed at regular intervals.

5. PREVENT INSIDE THREATS:

Any company can be exposed to internal security risks. This can happen when employees are disloyal to the organization or are simply unaware of security practices. It is important to run digital security workshops for employees so that they do not expose sensitive information to attackers.

6. ACCESS CONTROL AND ROOT ACCESS:

Users must obtain permission from the administrator to view and edit the data. This requires verifying and validating each user before granting permissions for operations such as data transfer and job submission.
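
One simple way to express such permissions is a role-based check, sketched below in Python. The roles, users, and action names are hypothetical; the point is only that every request is checked against what the administrator has granted, and anything not granted is denied.

```python
# Hypothetical roles and the actions an administrator has granted to each.
ROLE_PERMISSIONS = {
    "admin": {"view", "edit", "transfer", "submit_job"},
    "analyst": {"view", "submit_job"},
    "clerk": {"view"},
}

# Hypothetical user-to-role assignments.
USER_ROLES = {"alice": "admin", "bob": "analyst", "carol": "clerk"}


def is_allowed(user: str, action: str) -> bool:
    """Return True only if the user's role explicitly grants the action."""
    role = USER_ROLES.get(user)
    return action in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("bob", "submit_job"))
print(is_allowed("bob", "edit"))
print(is_allowed("mallory", "view"))
```

Unknown users and unknown actions fall through to a denial, so access has to be granted explicitly rather than assumed.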

7. DATA PROVENANCE:

Data provenance describes the origin of data, the process by which it was developed, and its history. For security purposes, an organization must know exactly where its data originated.
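
One possible way to keep such a history tamper-evident is to chain provenance entries by hash, so that altering any earlier entry invalidates everything after it. The Python sketch below illustrates the idea; the event strings and function names are hypothetical, and this is not a description of any particular provenance system.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(event: str, prev_hash: str) -> str:
    """Hash an entry's content together with the previous entry's hash."""
    body = json.dumps({"event": event, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def add_entry(chain: list, event: str) -> None:
    """Append a provenance event linked to the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev_hash": prev_hash,
                  "hash": _digest(event, prev_hash)})


def verify(chain: list) -> bool:
    """Check that every entry is intact and correctly linked."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["event"], entry["prev_hash"]):
            return False
        prev_hash = entry["hash"]
    return True


history = []
add_entry(history, "collected from clinic-a.example")
add_entry(history, "filtered to cardiology records")
add_entry(history, "loaded into analytics store")

print(verify(history))                       # chain is intact
history[1]["event"] = "filtered to all records"
print(verify(history))                       # tampering is detected
```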

8. MONITOR AND AUDIT:

Monitoring means detecting intrusions as they happen, whereas auditing means recording and analyzing user activity by maintaining a log of every change made to the data. Tools for both monitoring and auditing should be deployed in every organization so that immediate action can be taken whenever an intrusion by a third party is detected.
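
The two activities can be sketched together in a few lines of Python: an audit log that records each action, and a monitoring rule that scans the log for a suspicious pattern. The log fields, the choice of repeated failed logins as the warning sign, and the threshold of three are all assumptions made for illustration.

```python
from collections import Counter

audit_log = []  # in practice this would be durable, append-only storage


def record(user: str, action: str, success: bool) -> None:
    """Audit: append one user action to the log."""
    audit_log.append({"user": user, "action": action, "success": success})


def suspicious_users(log: list, max_failures: int = 3) -> list:
    """Monitor: flag users with at least max_failures failed logins."""
    failures = Counter(
        entry["user"]
        for entry in log
        if entry["action"] == "login" and not entry["success"]
    )
    return sorted(user for user, n in failures.items() if n >= max_failures)


record("alice", "login", True)
record("alice", "edit_record", True)
for _ in range(3):
    record("mallory", "login", False)
record("bob", "login", False)

print(suspicious_users(audit_log))
```

The same log serves both purposes: it is the evidence reviewed during an audit and the input that a monitoring rule evaluates to trigger immediate action.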

7. CONCLUSION:

The opportunities offered by big data are vast, but technical challenges such as privacy and security must be taken into consideration. This article discussed the security issues of big data in the healthcare sector and offered tips for securing it. It is important to maintain security, integrity, and data access control throughout the lifecycle.

8. REFERENCES:

1. https://www.gasystems.com.au/database-security-important/
2. http://dataconomy.com/2017/07/10-challenges-big-data-security-privacy/
3. https://www.fingent.com/blog/5-ways-big-data-is-changing-the-healthcare-industry
4. https://www.datapine.com/blog/big-data-examples-in-healthcare/
5. https://en.wikipedia.org/wiki/File:Hilbert_InfoGrowth.png
6. http://www.ijstr.org/final-print/mar2017/Improving-Healthcare-Using-Big-Data-Analytics.pdf